
    TAPER: query-aware, partition-enhancement for large, heterogeneous graphs

    Graph partitioning has long been seen as a viable approach to addressing Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to minimise inter-partition traversals for that workload. Additionally, it should also be possible to incrementally adjust the partitioning in reaction to changes in the graph topology, the query workload, or both. Because of their complexity, current partitioning algorithms fall short of one or both of these requirements, as they are designed for offline use and as one-off operations. The TAPER system aims to address both requirements, whilst leveraging existing partitioning algorithms. TAPER takes any given initial partitioning as a starting point, and iteratively adjusts it by swapping chosen vertices across partitions, heuristically reducing the probability of inter-partition traversals for a given workload of pattern matching queries. Iterations are inexpensive thanks to time and space optimisations in the underlying support data structures. We evaluate TAPER on two different large test graphs and over realistic query workloads. Our results indicate that, given a hash-based partitioning, TAPER reduces the number of inter-partition traversals by around 80%; given an unweighted METIS partitioning, by around 30%. These reductions are achieved within eight iterations and with the additional advantage of being workload-aware and usable online. Comment: 12 pages, 11 figures, unpublished.
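
    To give a concrete feel for swap-based refinement, the following is a minimal, hypothetical sketch (not TAPER's actual algorithm or data structures): starting from any initial vertex-to-partition assignment, each vertex is greedily moved to the partition holding most of its workload-weighted neighbours, subject to a simple balance cap, which drives down the weighted count of inter-partition edges.

    # Minimal sketch of workload-aware, swap-based repartitioning.
    # Hypothetical illustration only; TAPER's heuristics, data structures
    # and balance constraints are more sophisticated.
    from collections import defaultdict

    def cut_cost(partition, weighted_edges):
        """Total traversal weight of edges crossing partition boundaries."""
        return sum(w for (u, v), w in weighted_edges.items()
                   if partition[u] != partition[v])

    def refine(partition, weighted_edges, k, max_iters=8, imbalance=1.1):
        """Greedily move vertices to the partition holding most of their
        workload-weighted neighbours, keeping partitions roughly balanced."""
        sizes = defaultdict(int)
        for p in partition.values():
            sizes[p] += 1
        cap = imbalance * len(partition) / k
        for _ in range(max_iters):
            moved = False
            for v in list(partition):
                # Weight of v's edges towards each partition under the workload.
                gain = defaultdict(float)
                for (a, b), w in weighted_edges.items():
                    if a == v:
                        gain[partition[b]] += w
                    elif b == v:
                        gain[partition[a]] += w
                if not gain:
                    continue
                best = max(gain, key=gain.get)
                here = partition[v]
                if best != here and gain[best] > gain[here] and sizes[best] < cap:
                    sizes[here] -= 1
                    sizes[best] += 1
                    partition[v] = best
                    moved = True
            if not moved:
                break
        return partition

    # Example: 6 vertices, 2 partitions, edge weights = traversal frequency.
    edges = {(0, 1): 5.0, (1, 2): 5.0, (3, 4): 4.0, (4, 5): 4.0, (2, 3): 1.0}
    initial = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}   # hash-like assignment
    print(cut_cost(initial, edges))                            # 19.0 before refinement
    print(cut_cost(refine(dict(initial), edges, k=2), edges))  # 1.0 after refinement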

    Loom: Query-aware Partitioning of Online Graphs

    As with general graph processing systems, partitioning data over a cluster of machines improves the scalability of graph database management systems. However, these systems will incur additional network cost during the execution of a query workload, due to inter-partition traversals. Workload-agnostic partitioning algorithms typically minimise the likelihood of any edge crossing partition boundaries. However, these partitioners are sub-optimal with respect to many workloads, especially queries which may require more frequent traversal of specific subsets of inter-partition edges. Furthermore, they are largely unsuited to operating incrementally on dynamic, growing graphs. We present a new graph partitioning algorithm, Loom, that operates on a stream of graph updates and continuously allocates the new vertices and edges to partitions, taking into account a query workload of graph pattern expressions along with their relative frequencies. First we capture the most common patterns of edge traversals which occur when executing queries. We then compare sub-graphs, which present themselves incrementally in the graph update stream, against these common patterns. Finally we attempt to allocate each match to a single partition, reducing the number of inter-partition edges within frequently traversed sub-graphs and improving average query performance. Loom is extensively evaluated over several large test graphs with realistic query workloads and various orderings of the graph updates. We demonstrate that, given a workload, our prototype produces partitionings of significantly better quality than the existing streaming graph partitioning algorithms Fennel and LDG.
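
    The flavour of workload-aware streaming allocation can be illustrated with a hypothetical, LDG-style sketch (not Loom's actual pattern-based method): each arriving vertex is scored against every partition by the traversal frequency of its edges to already-placed neighbours, discounted by partition fullness.

    # Sketch of a workload-weighted streaming vertex assignment, in the
    # spirit of LDG/Fennel-style heuristics. Hypothetical illustration only;
    # Loom's pattern-based allocation is considerably more involved.
    def assign_stream(vertex_stream, adjacency, traversal_weight, k, capacity):
        """Assign each arriving vertex to the partition that already holds
        the largest workload-weighted set of its neighbours, preferring
        partitions with spare capacity."""
        partition, sizes = {}, [0] * k
        for v in vertex_stream:
            scores = [0.0] * k
            for u in adjacency.get(v, ()):
                if u in partition:
                    w = traversal_weight.get((u, v)) or traversal_weight.get((v, u), 1.0)
                    scores[partition[u]] += w   # frequently traversed edges count more
            candidates = [p for p in range(k) if sizes[p] < capacity]
            best = max(candidates, key=lambda p: scores[p] * (1 - sizes[p] / capacity))
            partition[v] = best
            sizes[best] += 1
        return partition

    # Example: edges (1,2) and (3,4) are traversed often by the workload,
    # (2,3) rarely; the heavy edges end up inside single partitions.
    adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
    weights = {(1, 2): 10.0, (2, 3): 1.0, (3, 4): 10.0}
    print(assign_stream([1, 2, 3, 4], adjacency, weights, k=2, capacity=2))
    # -> {1: 0, 2: 0, 3: 1, 4: 1}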

    Workload-sensitive approaches to improving graph data partitioning online

    PhD thesis. Many modern applications, from social networks to network security tools, rely upon the graph data model, using it as part of an offline analytics pipeline or, increasingly, for storing and querying data online, e.g. in a graph database management system (GDBMS). Unfortunately, effective horizontal scaling of this graph data reduces to the NP-hard problem of “k-way balanced graph partitioning”. Owing to the problem’s importance, several practical approaches exist, producing quality graph partitionings. However, these existing systems are unsuitable for partitioning online graphs, either introducing unnecessary network latency during query processing, being unable to efficiently adapt to changing data and query workloads, or both. In this thesis we propose partitioning techniques which are efficient and sensitive to given query workloads, suitable for application to online graphs and query workloads. To incrementally adapt partitionings in response to workload change, we propose TAPER: a graph repartitioner. TAPER uses novel data structures to compute the probability of expensive inter-partition traversals (ipt) from each vertex, given the current workload of path queries. Subsequently, it iteratively adjusts an initial partitioning by swapping selected vertices amongst partitions, heuristically maintaining low ipt and high partition quality with respect to that workload. Iterations are inexpensive thanks to time and space optimisations in the underlying data structures. To incrementally create partitionings in response to graph growth, we propose Loom: a streaming graph partitioner. Loom uses another novel data structure to detect common patterns of edge traversals when executing a given workload of pattern matching queries. Subsequently, it employs a probabilistic graph isomorphism method to incrementally and efficiently compare sub-graphs in the stream of graph updates to these common patterns. Matches are assigned within individual partitions if possible, thereby also reducing ipt and increasing partitioning quality w.r.t. the given workload. Both partitioner and repartitioner are extensively evaluated with real and synthetic graph datasets and query workloads. The headline results include that TAPER can reduce ipt by up to 80% over a naive existing partitioning and can maintain this reduction in the event of workload change, through additional iterations. Meanwhile, Loom reduces ipt by up to 40% over a state-of-the-art streaming graph partitioner.

    Occupational stress and user satisfaction with primary healthcare in Portugal

    The Portuguese primary healthcare sector has suffered changes due to a reform along the lines of the conceptual framework referred to by some authors as "New Public Management." These changes may be generating higher levels of occupational stress with a negative impact at individual and organizational levels. This study examines the experience of stress in 305 health professionals (physicians, nurses and clinical secretaries) and the satisfaction of 392 users with the services provided to them. The population under scrutiny is taken from 10 type A and 10 type B Family Health Units (FHU). The results show that 84.2% of professionals report moderate to high levels of occupational stress, with nurses being those with the highest levels. Users reported good levels of satisfaction, especially with the nursing services. There were no differences in stress levels between type A and type B FHUs, though there were differences in user satisfaction, with type B FHU users showing higher levels of satisfaction. Dimensions of user satisfaction were affected by stress related to excess work.

    Urban coral reefs: Degradation and resilience of hard coral assemblages in coastal cities of East and Southeast Asia

    © 2018 The Author(s). Given predicted increases in urbanization in tropical and subtropical regions, understanding the processes shaping urban coral reefs may be essential for anticipating future conservation challenges. We used a case study approach to identify unifying patterns of urban coral reefs and clarify the effects of urbanization on hard coral assemblages. Data were compiled from 11 cities throughout East and Southeast Asia, with particular focus on Singapore, Jakarta, Hong Kong, and Naha (Okinawa). Our review highlights several key characteristics of urban coral reefs, including “reef compression” (a decline in bathymetric range with increasing turbidity and decreasing water clarity over time and relative to shore), dominance by domed coral growth forms and low reef complexity, variable city-specific inshore-offshore gradients, early declines in coral cover with recent fluctuating periods of acute impacts and rapid recovery, and colonization of urban infrastructure by hard corals. We present hypotheses for urban reef community dynamics and discuss the potential of ecological engineering for corals in urban areas.

    TAPER: query-aware, partition-enhancement for large, heterogeneous graphs

    Graph partitioning has long been seen as a viable approach to addressing Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to minimise inter-partition traversals for that workload. Additionally, it should also be possible to incrementally adjust the partitioning in reaction to changes in the graph topology, the query workload, or both. Because of their complexity, current partitioning algorithms fall short of one or both of these requirements, as they are designed for offline use and as one-off operations. The TAPER system aims to address both requirements, whilst leveraging existing partitioning algorithms. TAPER takes any given initial partitioning as a starting point, and iteratively adjusts it by swapping chosen vertices across partitions, heuristically reducing the probability of inter-partition traversals for a given workload of path queries. Iterations are inexpensive thanks to time and space optimisations in the underlying support data structures. We evaluate TAPER on two different large test graphs and over realistic query workloads. Our results indicate that, given a hash-based partitioning, TAPER reduces the number of inter-partition traversals by ∼80%; given an unweighted METIS partitioning, by ∼30%. These reductions are achieved within eight iterations and with the additional advantage of being workload-aware and usable online.

    Workload-aware streaming graph partitioning

    Partitioning large graphs, in order to balance storage and processing costs across multiple physical machines, is becoming increasingly necessary as the typical scale of graph data continues to increase. A partitioning, however, may introduce query processing latency due to inter-partition communication overhead, especially if the query workload exhibits skew, frequently traversing a limited subset of graph edges. Existing partitioners are typically workload-agnostic and susceptible to such skew; they minimise the likelihood of any edge crossing partition boundaries. We present our progress on LOOM: a streaming graph partitioner based upon efficient existing heuristics, which reduces inter-partition traversals when executing a stream of sub-graph pattern matching queries Q. We are able to continuously summarise the traversal patterns caused by queries within a window over Q. We do this using a generalisation of a trie data structure, which we call TPSTry++, to compactly encode frequent sub-graphs, or motifs, common to many query graphs in Q. When the graph stream being partitioned contains a match for a motif, LOOM uses graph-stream pattern matching to capture it and place it wholly within partition boundaries. This increases the likelihood that a random query q ∈ Q may be answered within a single partition, with no inter-partition communication to introduce additional latency. Finally, we discuss the potential pitfalls and drawbacks of our approach, and detail the work yet to be completed.
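
    As a rough intuition for a trie-based traversal summary (a hypothetical simplification; TPSTry++ itself generalises the trie to whole query graphs rather than plain label sequences), one can record the edge-label sequences traversed by queries in a window and count how often each prefix occurs, then test candidate sub-graphs against those counts.

    # Rough intuition for a trie-based summary of traversal patterns.
    # Hypothetical sketch only; TPSTry++ generalises this idea to whole
    # query graphs rather than simple label sequences.
    class PatternTrie:
        def __init__(self):
            self.children = {}
            self.count = 0          # how many observed traversals pass through here

        def add(self, labels):
            """Record one traversal, e.g. ('knows', 'likes'), and all its prefixes."""
            node = self
            for lab in labels:
                node = node.children.setdefault(lab, PatternTrie())
                node.count += 1

        def frequency(self, labels):
            """How often the given label sequence (or prefix) was observed."""
            node = self
            for lab in labels:
                node = node.children.get(lab)
                if node is None:
                    return 0
            return node.count

    # Example: summarise a window of query traversals, then test a candidate
    # sub-graph's edge-label path against the summary.
    trie = PatternTrie()
    for path in [("knows", "likes"), ("knows", "likes"), ("knows", "worksAt")]:
        trie.add(path)
    print(trie.frequency(("knows",)))          # 3 - very common prefix
    print(trie.frequency(("knows", "likes")))  # 2 - frequent motif
    print(trie.frequency(("owns",)))           # 0 - not part of the workload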

    ProvGen: Generating synthetic PROV graphs with predictable structure

    This paper introduces provGen, a generator aimed at producing large synthetic provenance graphs with predictable properties and of arbitrary size. Synthetic provenance graphs serve two main purposes. Firstly, they provide a variety of controlled workloads that can be used to test the storage and query capabilities of provenance management systems at scale. Secondly, they provide challenging testbeds for experimenting with graph algorithms for provenance analytics, an area of increasing research interest. provGen produces PROV graphs and stores them in a graph DBMS (Neo4j). A key feature is that it lets users control the relationship makeup and topological features of the graph, by providing a seed provenance pattern along with a set of constraints, expressed using a custom Domain Specific Language. We also propose a simple method for evaluating the quality of the generated graphs, by measuring how realistically they simulate the structure of real-world patterns.
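
    The seed-pattern expansion idea can be sketched as follows (a hypothetical illustration; provGen's DSL-driven constraints and Neo4j storage are far richer than this): a simple used/wasGeneratedBy pattern is instantiated repeatedly, with a fan-out parameter standing in for a user-supplied constraint.

    # Minimal sketch of seed-pattern expansion for a synthetic PROV-like graph.
    # Hypothetical illustration only; not provGen's actual generator.
    import random

    def generate(num_activities, max_used, seed=42):
        """Instantiate a simple seed pattern repeatedly: each activity
        'used' 1..max_used existing entities, and one new entity
        'wasGeneratedBy' each activity."""
        rng = random.Random(seed)
        edges = []                       # (source, relation, target) triples
        entities = ["e0"]                # an initial entity to bootstrap 'used'
        for i in range(num_activities):
            act = f"a{i}"
            fan_out = min(len(entities), rng.randint(1, max_used))
            for ent in rng.sample(entities, k=fan_out):
                edges.append((act, "used", ent))
            new_ent = f"e{i + 1}"
            edges.append((new_ent, "wasGeneratedBy", act))
            entities.append(new_ent)
        return edges

    for triple in generate(num_activities=3, max_used=2):
        print(triple)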